Evaluation of the psychometric properties of the family adaptability and cohesion scale (FACES III) through item response theory models in students from Chile and Colombia

Background Psychometric studies of the Family Adaptability and Cohesion Scale (FACES III) in Spanish-speaking countries have been conducted from the perspective of classical test theory. However, this approach has limitations that affect the psychometric understanding of the scale. Objective Accordingly, this study used item response theory to investigate the psychometric performance of the items. It also evaluated the differential functioning of the items between Colombia and Chile. Method A total of 518 health science students from both countries participated. Confirmatory Factor Analysis was used. Results The cohesion and adaptability items presented adequate discrimination and difficulty indices. In addition, cohesion items 5, 8, 13, 17, and 19 showed differential functioning between students from the two countries, with greater discriminatory power among Chilean students. For adaptability item 18, the Colombian group exhibited greater discriminatory power. Conclusions The items of FACES III showed adequate psychometric performance in terms of their discriminative capacity and difficulty in Chile and Colombia.


Introduction
Family functioning is a widely studied construct whose importance for developing and maintaining indicators of mental health in individuals has been demonstrated [1,2]. A few studies have revealed that deficient family functioning is related to emotional problems such as anxiety and depression [3][4][5]. By contrast, more positive family functioning may favor better adjustment in youth [6][7] and fewer psychological problems [8]. In this context, family functioning plays an essential role at the beginning of the university stage, since it favors better adaptation to and coping with the demands of academic life [9]. In addition, a study conducted in China with medical students revealed that adequate family functioning is related to a lower presence of symptoms of depression and anxiety [10]. Similarly, another study conducted in the same country on medical, nursing, and medical technology students reported that good family functioning is associated with decreased risks of distress and stress [11]. Furthermore, a study conducted in the United States with nursing students revealed that better family functioning is related to lower stress, anxiety, and depression [12]. In Nigeria, a study conducted with health sciences students revealed that negative family functioning is associated with a higher level of depression [13]. In Latin America, a study conducted with Colombian medical students demonstrated that deficient family functioning is a risk factor for psychological distress [9]. Likewise, another study conducted in the same country indicated that family functioning is a predictor of academic achievement [14]. A study conducted in Chile reported that family functioning is a protective factor against risk behaviors in students [15]. Therefore, adequately measuring family functioning in health science students is crucial. In relation to this, the Family Adaptability and Cohesion Scale (FACES) is the instrument most widely used to study this construct, and different versions of it have been developed. Among them, FACES III is the most used, enabling a linear assessment of family functioning from the circumplex model [16].
Numerous studies conducted in Latin America have examined the psychometric performance of FACES III. In Argentina, a confirmatory factor analysis (CFA) was used to examine the factorial structure of the scale [17]. In Mexico, an exploratory factor analysis (EFA) was used to examine the psychometric performance of the scale [18]. In Chile, EFA was used to examine the factorial structure of the scale [19], and a second-order CFA was used in another study conducted in the same country [20]. In Peru, a combination of EFA and CFA was used to examine scale performance [21]. Similarly, another study used EFA and CFA to examine the cohesion and adaptability dimensions of FACES III [22]. However, neither in the countries mentioned nor, by extension, in the rest of Latin America does the available evidence include psychometric studies that analyze the internal structure of FACES III with approaches other than these.
As indicated in previous studies, all approaches, such as EFA and CFA, have been based on classical test theory (CTT). However, CTT has severe limitations [23]: (a) lack of invariance of the results with respect to the instrument used and (b) lack of invariance of the psychometric properties of the tests with respect to the group used to calculate them. These limitations may explain why the factorial structure of FACES III differs across the studies that have analyzed it. By contrast, item response theory (IRT) presents three fundamental advantages [24]: (a) invariance of the item parameters, i.e., the item parameters do not vary even if the respondents differ; (b) invariance of the respondent's trait parameter with respect to the instrument used to calculate it, i.e., the ability level of the respondent does not depend on the test; and (c) provision of local measures of accuracy through the item information curve (IIC) and the test information curve (TIC). These features provide detailed knowledge of the region of the trait that the test measures best. In other words, they show for which level of the trait the instrument is best suited. In addition, IRT makes it possible to examine the differential functioning of the items between groups, which allows more reliable comparisons between those evaluated.
For all these reasons, the general objective of this research is to study the psychometric functioning of FACES III using IRT. Specifically, (a) the degree of discrimination and difficulty of the FACES III items will be studied, and (b) the differential functioning of the items between students from Colombia and Chile will be evaluated.

Design
The present study used an instrumental design since the psychometric performance of a measurement instrument was analyzed [25].

Participants
The study involved 518 physiotherapy and kinesiology students from universities in Colombia (Universidad Simón Bolívar) and Chile (Universidad de Atacama). Table 1 shows that the average age of participants living in Colombia is 20.1 (SD = 3.4) and that of participants in Chile is 21.8 (SD = 4.0). As indicated in the table, the samples from both countries comprise a higher proportion of women (Colombia = 84.1%; Chile = 53.4%) than of men (Colombia = 15.9%; Chile = 46.6%). In addition, 63.1% are physiotherapy majors, and 36.9% are kinesiology majors. Finally, the samples from both countries include students from different academic years.

Procedure
For the study, approval was obtained from the ethics committee of the Universidad de San Sebastián, Chile (Resolution N° 2/2015 and N° 83/2020), and the standards established in the Declaration of Helsinki were followed [26]. The data were obtained in November 2019, and the collection process was the same for both countries.
For data collection, non-probabilistic convenience sampling was used. An online form was administered in classrooms. In the form, informed consent was presented first, followed by the study objectives and contact information for the study coordinators. The students gained access to the FACES III questions only after providing informed consent. During the data collection process, data confidentiality and the opportunity to withdraw from the evaluation at any time were ensured.

Data analysis
First, compliance with the main assumptions of IRT was evaluated. A separate graded response model (GRM) was fitted for each dimension of FACES III to meet the unidimensionality assumption. The G2 index [27], specifically Cramér's V coefficient, which takes values between −1 and 1, was used to evaluate the assumption of local independence of the items [28]. A large absolute value indicates a potential case of local dependence [29]. Compliance with the monotonicity assumption was also inspected using raw residual plots [30].
A GRM [31], specifically an extension of the 2-parameter logistic model (2-PLM) for ordered polytomous items [32], was used to estimate the IRT models. The C2 test developed for ordinal items [33] was used to assess model fit. The following fit criteria were used: root mean square error of approximation (RMSEA) ≤ 0.06 [34] and standardized root mean square residual (SRMSR) ≤ 0.05 [35]. Comparative Fit Index (CFI) and Tucker-Lewis Index (TLI) values were also considered using the same fit criterion (≥ 0.95) employed in SEM models [36]. The generalized S-X2 index and its corresponding RMSEA, as a measure of effect size, were used to assess item fit [37].
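As a minimal sketch of how the cutoffs above operate, the following Python snippet checks a set of fit indices against them. The function name `check_fit` and the example values are hypothetical; they are not results from this study, and they serve only to show the direction of each criterion.

```python
# Cutoffs as stated in the text. "max" means the index must not exceed the
# cutoff; "min" means the index must reach it.
CUTOFFS = {
    "RMSEA": ("max", 0.06),
    "SRMSR": ("max", 0.05),
    "CFI": ("min", 0.95),
    "TLI": ("min", 0.95),
}


def check_fit(indices):
    """Return, for each fit index, whether it meets its cutoff."""
    result = {}
    for name, (kind, cutoff) in CUTOFFS.items():
        value = indices[name]
        result[name] = value <= cutoff if kind == "max" else value >= cutoff
    return result


# Made-up values for illustration only (not taken from the study).
example = {"RMSEA": 0.05, "SRMSR": 0.06, "CFI": 0.96, "TLI": 0.94}
print(check_fit(example))
```

In this made-up case, RMSEA and CFI would meet their criteria, whereas SRMSR and TLI would not.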
In the GRM models, two types of parameters were estimated: discrimination (a) and difficulty (b). The discrimination parameter determines the slope at which item responses change as a function of the level of the latent trait. The item difficulty parameters determine how much of the latent trait is required to endorse a given response category. As the scale comprises five response categories, there are four difficulty estimates, one per threshold. Each threshold estimate indicates the level of the latent variable at which an individual has a 50% chance of scoring at or above a particular response category. The following graphs representing item and test performance for each latent trait were also produced: item characteristic curve (ICC), test characteristic curve (TCC), IIC, and TIC.
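The role of the two parameter types can be illustrated with a short Python sketch of the GRM in its usual logistic form, where the probability of responding at or above a threshold k is 1 / (1 + exp(−a(θ − b_k))). The function names and the parameter values below are illustrative assumptions; they are not the estimates reported in Table 3.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_probs(theta, a, b):
    """P(response at or above each threshold), one value per threshold b_k."""
    return [logistic(a * (theta - b_k)) for b_k in b]


def category_probs(theta, a, b):
    """Probability of each response category (len(b) + 1 categories)."""
    cum = [1.0] + cumulative_probs(theta, a, b) + [0.0]
    # Each category probability is the difference of adjacent cumulative ones.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Hypothetical item: one discrimination and four ordered thresholds,
# matching a five-category response scale.
a_example = 1.5
b_example = [-2.0, -1.0, 0.0, 1.0]

probs = category_probs(0.0, a_example, b_example)
print([round(p, 3) for p in probs])

# At theta equal to a threshold, the chance of scoring at or above the
# corresponding category is 50%.
print(cumulative_probs(b_example[2], a_example, b_example)[2])
```

Evaluating `category_probs` over a grid of θ values traces the category curves that an ICC plot displays for a single item.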
We used the likelihood ratio approach for ordinal items to assess differential item functioning (DIF) according to the participants' country [29]. Under this approach, two models were estimated: (a) a no-DIF model, where all item parameters are invariant between groups, and (b) a DIF model, where item parameters can differ between groups. The no-DIF (reduced) and DIF (full) models were compared using the log-likelihood ratio test with the ANOVA function to test for possible differences in item parameters between groups. In this comparison, the null hypothesis states that there is no DIF, i.e., the parameters of the items are equal between countries. A p value < 0.05 was used to reject the null hypothesis.
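The comparison of the reduced and full models can be sketched in Python as follows. This is a minimal illustration of a likelihood ratio test, not the authors' code: the log-likelihood values are made up, and the 5 degrees of freedom assume that one discrimination and four threshold parameters of a single five-category item are freed in the full model.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in x.
    series, term = 0.0, 1.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * series


def likelihood_ratio_test(loglik_reduced, loglik_full, df, alpha=0.05):
    """Compare a no-DIF (reduced) model with a DIF (full) model."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    p_value = chi2_sf(stat, df)
    return stat, p_value, p_value < alpha


# Hypothetical log-likelihoods, for illustration only.
stat, p_value, reject = likelihood_ratio_test(-5210.4, -5201.9, df=5)
print(round(stat, 2), round(p_value, 4), reject)
```

When `reject` is true, the null hypothesis of equal item parameters between countries would be rejected for that item at the 0.05 level.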
The 'mirt' and 'ltm' packages [28,38] were used to estimate the GRM models and conduct the DIF analysis. The RStudio environment [39] for R [40] was used in all cases.

Descriptive analysis of the items
Table 2 indicates that in the cohesion dimension, item 13 ('family members support each other in difficult times') presents the highest average score (M = 3.58). That is, most participants agree with this statement. In the adaptability dimension, item 9 ('family members are free to express themselves') presents the highest average score (M = 3.46). That is, most participants agree with this statement. In addition, the skewness and kurtosis of the items indicate a distribution that departs only moderately from normality (skewness within ±2; kurtosis within ±7) [41]. Furthermore, all the response categories of all the items were used by the participants.
In addition, the raw residual plots for the items of both dimensions did not indicate a substantial deviation from monotonicity.
Regarding the parameters of the GRM cohesion model, Table 3 indicates that all the item discrimination parameters are above 1.35, a value generally considered to indicate a high level of discrimination [42]. Table 3 also shows that the adaptability items present adequate discrimination indices (> 1.35), except for items 2 (1.18) and 3 (1.14), which indicate moderate levels. Concerning the difficulty parameters, all threshold estimates increased monotonically. That is, a higher level of the latent trait is required to endorse the higher response categories. Figure 1 shows that the response alternatives of the items are monotonically related to the levels of cohesion and adaptability, respectively. That is, as one moves from left to right in the ICCs, the probability of choosing a given response category increases and then decreases as responses move to the next higher category.
Figure 2 shows a sharp increase in the total scores of FACES III as the actual level of cohesion and adaptability increases. Figure 3 depicts the IIC and TIC. For cohesion, the IIC indicates that items 6 and 7 are the most accurate items of the scale for assessing the latent trait. In addition, the TIC indicates that the test is more reliable (accurate) in the scale range between −2.5 and 0.5. Regarding adaptability, the IIC indicates that items 11 and 19 are the most accurate items of the scale for assessing the latent trait. In addition, the TIC indicates that the test is more reliable (accurate) in the scale range between −2.5 and 0.5.

Differential item functioning (DIF)
Table 4 presents the differential analysis of the items between the Chilean and Colombian participants. For cohesion, items 5, 8, 13, 17, and 19 present differential functioning between the two groups, with greater discriminatory power for students from Chile, except for item 13, for which the Colombian group has greater discriminatory power. For the other cohesion items, the ANOVA comparison indicates no presence of DIF (p > .05). With respect to adaptability, only item 18 shows differential functioning between the groups, with greater discriminatory power for the Colombian group. These differences can also be observed in the ICCs of these items (see Fig. 4).

Discussion
This study evaluated the performance of FACES III items based on IRT. A CFA was performed for each dimension separately to comply with the assumption of unidimensionality, and each factor presented adequate fit to the data. This approach is similar to that of other studies [43,44], including studies on FACES III [22]. The local independence of the items and the fulfillment of the monotonicity assumption were also demonstrated, all of which supports the validity of the estimates obtained [45].
Regarding the cohesion factor, all the items presented adequate discrimination indices, showing that they adequately differentiate the responses of people with different levels of family cohesion. Items 11 and 19, which presented the highest discrimination parameters, refer to the preference for sharing leisure time with the family and to the belief that the family unit is essential. This is expected, as family participation in leisure activities has been suggested to have a significant relation with family quality of life [46] and family cohesion [47]. In addition, personal beliefs about the family unit have been linked to a higher level of family resilience [48]. In the adaptability factor, all the items adequately differentiate the responses of people with different levels of family adaptability. In particular, items 7 and 6, which presented the highest discrimination parameters, refer to two essential aspects: (a) when faced with a problem, the family usually negotiates to find a solution, and (b) the children's opinion is taken into account to develop family discipline guidelines. In relation to this, seeking a consensus among family members to face a problem favors family adaptation to new contexts [49]. In addition, assertive family discipline favors better coexistence and family adaptability [50]. Regarding the difficulty indices, the items of both factors indicated increasing monotonic values, i.e., people with low levels of cohesion and adaptability choose the first or second category.
As the level of the trait increases, respondents choose higher categories. This pattern reflects the fact that the content of each of the 20 items makes it possible to take advantage of all the response alternatives and that there is no loss of information. In relation to DIF, in the cohesion factor, items 5, 8, 13, 17 and 19 presented DIF between university students from Colombia and Chile, with a greater discriminatory power for the Chilean group. These items better distinguish low and high levels of cohesion in Chilean students. Furthermore, the results indicate that Colombian students require a higher level of cohesion to answer the higher categories in items 19, 17, and 13. By contrast, Chilean students need a higher level of the latent trait to answer the higher categories in items 1, 5, and 8. Regarding the adaptability factor, only item 18 indicated differential functioning between the two countries, with a greater discriminatory power for Colombian students. Considering that social and cultural aspects are closely linked to family functioning [51], the cultural, economic, and educational differences between the two countries could lead to different interpretations of the family functioning items, especially of the items presenting DIF.
This study has several limitations. First, a non-probabilistic convenience sample was used, limiting the generalizability of the results to both countries. In addition, both groups had a predominance of women and young participants (< 30 years). Further, the sample sizes differed between countries, with Colombia contributing the larger sample. Therefore, future studies should use probability sampling techniques and larger and more representative samples for both countries. Third, DIF was not assessed according to sex and age due to the sample size. Therefore, future studies should perform a DIF analysis for these groups. Despite these limitations, the items of FACES III present adequate psychometric performance both in terms of their discriminative capacity and difficulty. Therefore, the items provide helpful information on levels of cohesion and adaptability, thereby allowing a better understanding of family functioning in health science students. Notably, this is the first study to show evidence of the psychometric performance of FACES III through IRT in Chile and Colombia. Moreover, as indicated in the study, some items of cohesion and adaptability present differential functioning between the two countries. These results should be considered when making cross-cultural comparisons of family functioning in health sciences students in Colombia and Chile.

Fig. 1 Item characteristic curve of FACES III

Fig. 2

Fig. 3 Item and test information curves for FACES III

Table 1
Demographic characteristics of the participants

Table 2
Descriptive analysis of the items and response rate of the items

Table 3
Parameters of the items of the GRM models for the dimensions of FACES III
Note. a = discrimination parameters; b = difficulty parameters